431 Class 24

Thomas E. Love, Ph.D.

2023-11-30

Today’s Agenda

  • The here() package
  • Turning values like 77 and 99 into NA and vice versa
  • McNemar’s test for paired samples comparisons of proportions
  • Two more power calculation examples
  • ChatGPT and Large Language Models: Works in progress

Today’s Packages

library(here)
library(naniar)
library(janitor)
library(broom)
library(gt)
library(epibasix) ## for mcNemar function
library(exact2x2) ## for exact2x2 function for McNemar
library(xfun) ## for session_info()
library(tidyverse)

theme_set(theme_bw())

The here() package

The here() package

https://here.r-lib.org/

The here package builds paths relative to the top-level directory of the current project. The package displays that top-level directory on load, or any time you call here():

here()
[1] "D:/Teaching/431/2023-431/431-slides-2023"

The here() package

Jenny Bryan’s Ode to the here package

Jenny Bryan on Project-oriented workflow

Using R projects and the here package

How can you avoid setwd() at the top of every script?

  1. Organize each logical project into a folder on your computer.
  2. Make sure the top-level folder advertises itself as such. If you use RStudio and/or Git, those both leave characteristic files that get the job done.
  3. Use the here() function to build the path when you read or write a file. Create paths relative to the top-level directory.
  4. Whenever you work on this project, launch the R process from the project’s top-level directory.
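As a minimal sketch of step 3: build the path with here() rather than hard-coding a working directory with setwd(). The data/ subfolder and the file name study.csv below are hypothetical.

```r
# Sketch: here() locates the project's top-level directory (marked by,
# e.g., an .Rproj file or a .git folder) and builds a path from there,
# so this code works no matter which subfolder the script lives in.
# The data/ subfolder and "study.csv" are hypothetical examples.
library(here)

path <- here("data", "study.csv")   # ".../<project-root>/data/study.csv"
# dat <- readr::read_csv(path)      # then read the file using that path
```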

Creating / Replacing Missing Values

Turning values like 77 and 99 into NA

Suppose we have the following small data set, where 77 = “Refused” and 99 = “No Response” or some other term that we want to think of as “missing”.

var1 <- c(20, 22, 35, 19, 77, 99)
var2 <- c(1, 3, 4, 77, 6, 99)
var3 <- c("Yes", "No", 77, 99, "No", "Yes")
dat <- tibble(var1, var2, var3) |> mutate(var3 = factor(var3))

miss_var_summary(dat)
# A tibble: 3 × 3
  variable n_miss pct_miss
  <chr>     <int>    <dbl>
1 var1          0        0
2 var2          0        0
3 var3          0        0

How can we convince R the 77s and 99s are missing values?

Use replace_with_na() from naniar

dat1 <- dat |>
  replace_with_na(
    replace = list(var1 = c(77, 99), var2 = c(77, 99),
                   var3 = c(77, 99)))
miss_var_summary(dat1)
# A tibble: 3 × 3
  variable n_miss pct_miss
  <chr>     <int>    <dbl>
1 var1          2     33.3
2 var2          2     33.3
3 var3          2     33.3

More on replace_with_na() here

Replacing 77 and 99 with NA across all variables

dat2 <- dat |>
  replace_with_na_all(
    condition = ~.x %in% c(77, 99))
miss_var_summary(dat2)
# A tibble: 3 × 3
  variable n_miss pct_miss
  <chr>     <int>    <dbl>
1 var1          2     33.3
2 var2          2     33.3
3 var3          2     33.3

Other ways to extend replace_with_na() are described here
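If you prefer not to add a package for this task, the same replacement can be sketched with dplyr's across() and base R's replace(), re-creating the small dat tibble from above. (This is an alternative approach, not what naniar does internally.)

```r
library(dplyr)
library(tibble)

# Re-create the small example tibble from earlier in these slides
var1 <- c(20, 22, 35, 19, 77, 99)
var2 <- c(1, 3, 4, 77, 6, 99)
var3 <- c("Yes", "No", 77, 99, "No", "Yes")
dat <- tibble(var1, var2, var3) |> mutate(var3 = factor(var3))

# replace() with a logical index works for numeric and factor columns
# alike; %in% coerces each column to character before matching.
dat3 <- dat |>
  mutate(across(everything(), \(x) replace(x, x %in% c(77, 99), NA)))
```

The result should match what replace_with_na_all() produced: two missing values in each column.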

What if we have the opposite issue?

The replace_na() function from the tidyr package (details here) replaces an NA value with a specified value.

In that sense, it is the complement to the replace_with_na() function.

Demo: Replacing NA with a value

df <- tibble(x = c(1, 2, NA), y = c("a", NA, "b"))
df
# A tibble: 3 × 2
      x y    
  <dbl> <chr>
1     1 a    
2     2 <NA> 
3    NA b    
df1 <- df |> replace_na(list(x = 0, y = "unknown"))
df1
# A tibble: 3 × 2
      x y      
  <dbl> <chr>  
1     1 a      
2     2 unknown
3     0 b      

More on replace_na() here

Comparing Proportions with Paired Samples

Paired Samples: Comparing Proportions

Suppose we wish to investigate the association between retirement status and heart disease. One concern might be the age of the subjects: an older person is both more likely to be retired and more likely to have heart disease. In one study, therefore, 127 victims of cardiac arrest were matched on a number of characteristics that included age with 127 healthy control subjects: retirement status was then ascertained for each subject, with the results shown on the next slide.

Data from 254 subjects

                 Retired  Not Retired  ALL
Healthy Control       39           88  127
Cardiac Arrest        47           80  127
ALL                   86          168  254

We might think to use our usual twobytwo() function, but each matched pair in this study provides two responses: one from the Healthy Control and one from the Cardiac Arrest subject.

  • Pearson \(\chi^2\) test doesn’t take this pairing into account.
  • Our analysis must take this pairing into account.

Data from 127 matched pairs

Note

CA = Cardiac Arrest, HC = Healthy Control

                 CA, Retired  CA, Not Retired  Total
HC, Retired               27               12     39
HC, Not Retired           20               68     88
Total                     47               80    127

The relevant data are stored for you in the retire.csv file.

The retire tibble

retire <- read_csv(here("c24", "data", "retire.csv")) |>
  mutate(healthy_control = 
           fct_relevel(factor(healthy_control), "Retired", "Not Retired"),
         cardiac_arrest = 
           fct_relevel(factor(cardiac_arrest), "Retired", "Not Retired"))

retire |> tabyl(healthy_control, cardiac_arrest) |>
  adorn_totals(where = c("row", "col")) |> adorn_title()
                 cardiac_arrest                  
 healthy_control        Retired Not Retired Total
         Retired             27          12    39
     Not Retired             20          68    88
           Total             47          80   127
  • Goal: Estimate the relative odds of being retired for healthy individuals versus those who have experienced cardiac arrest.

What are we testing?

                 CA, Retired  CA, Not Retired
HC, Retired               27               12
HC, Not Retired           20               68

\(H_0\): There are equal numbers of pairs in which the healthy control retired and the matched individual who had a cardiac arrest did not retire, and in which the healthy control did not retire but the matched individual who had a cardiac arrest did retire.

  • or \(H_0\): There is no association between health status and retirement.

Concordant vs. Discordant Pairs

                 CA, Retired  CA, Not Retired
HC, Retired               27               12
HC, Not Retired           20               68

The concordant pairs - those in which both members of the pair are retired, or both are not retired - provide no information for testing a null hypothesis about differences in retirement status.

Thus, we focus only on the discordant pairs - those in which a retired person is matched with a person who is not retired.

The McNemar Odds Ratio

                 CA, Retired  CA, Not Retired
HC, Retired               27               12
HC, Not Retired           20               68

The McNemar odds ratio is just the ratio of the two discordant counts, here 12/20 = 0.60, which indicates the association's strength: the odds of being retired for healthy controls are estimated to be 0.6 times as high as the odds of being retired for those with cardiac arrest.
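In code, with b = 12 (HC retired, CA subject not retired) and c = 20 (HC not retired, CA subject retired), the estimate and a simple large-sample 90% interval on the log odds ratio scale look like this. (This interval is just one of several approximations; epibasix and exact2x2 use different methods and will give slightly different limits.)

```r
b  <- 12   # discordant pairs: HC retired, CA subject not retired
cc <- 20   # discordant pairs: HC not retired, CA subject retired

or_hat <- b / cc
or_hat                      # 0.6

# A large-sample 90% CI built on the log odds ratio scale:
# log(OR) +/- z * sqrt(1/b + 1/c)
se <- sqrt(1 / b + 1 / cc)
exp(log(or_hat) + c(-1, 1) * qnorm(0.95) * se)   # roughly 0.33 to 1.09
```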

McNemar’s test

\(H_0\): No association between retirement status and cardiac arrest

table(retire$healthy_control, retire$cardiac_arrest) |>
  mcnemar.test() 

    McNemar's Chi-squared test with continuity correction

data:  table(retire$healthy_control, retire$cardiac_arrest)
McNemar's chi-squared = 1.5312, df = 1, p-value = 0.2159

Without continuity correction, and tidied…

table(retire$healthy_control, retire$cardiac_arrest) |>
  mcnemar.test(correct = FALSE) |> tidy() |> gt()
 statistic    p.value  parameter  method
         2  0.1572992          1  McNemar's Chi-squared test

McNemar’s test via epibasix package (includes odds ratio)

The mcNemar() function comes from the epibasix package.

table(retire$healthy_control, retire$cardiac_arrest) |>
  mcNemar(alpha = 0.1)

Matched Pairs Analysis: McNemar's Chi^2 Statistic and Odds Ratio
 
McNemar's Chi^2 Statistic (corrected for continuity) = 1.531 which has a p-value of: 0.216
 
McNemar's Odds Ratio (b/c): 0.6
90% Confidence Limits for the OR are: [0.28, 1.134]

McNemar’s test is typically valid when the number of discordant pairs exceeds 30.

McNemar test via exact2x2() (also includes odds ratio)

  • The exact2x2 package can also do this, using a slightly different approximation for the confidence interval.
exact2x2(retire$healthy_control, retire$cardiac_arrest, paired = TRUE,
         conf.int = TRUE, conf.level = 0.90)

    Exact McNemar test (with central confidence intervals)

data:  retire$healthy_control and retire$cardiac_arrest
b = 12, c = 20, p-value = 0.2153
alternative hypothesis: true odds ratio is not equal to 1
90 percent confidence interval:
 0.3031389 1.1534984
sample estimates:
odds ratio 
       0.6 

More Involved Power Calculations That Still Use Balanced Designs

Example 1 Setup (1/2)

In a double-blind trial, patients with active rheumatoid arthritis will be randomly assigned to receive one of two therapy types: a cheaper one, or a pricier one.

The proposed primary analysis will estimate the difference in the proportion of participants who had a DAS28 of 3.2 or less at week 48. The DAS28 is a composite index of the number of swollen and tender joints, the erythrocyte sedimentation rate, and a visual-analogue scale of patient-reported disease activity.

Example 1 Setup (2/2)

The study’s power analysis established a sample size target of 225 completed enrollments in each therapy group, based on a two-sided 10% significance level, and a desire for 90% power, assuming the proportion of participants with a DAS28 of 3.2 or less at week 48 was 0.27 under the less effective therapy.

What value was used in the power calculation for the proportion of participants with DAS28 of 3.2 or less at week 48 for the more effective therapy? Round your response to two decimal places.

Example 1: What code do we need?

  • Are we comparing means or proportions?
  • Paired or independent samples?
  • What do we want to estimate? Necessary sample size? Power? Something else?
  • What information is available to us?

Example 1 Code

power.prop.test(n = 225, p1 = 0.27, sig.level = 0.1, power = .9)

     Two-sample comparison of proportions power calculation 

              n = 225
             p1 = 0.27
             p2 = 0.3996866
      sig.level = 0.1
          power = 0.9
    alternative = two.sided

NOTE: n is number in *each* group

Rounding to two decimal places, p2 was 0.40
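As a quick check, plugging the rounded p2 = 0.40 back into power.prop.test should return power of essentially 90%:

```r
# Verify the rounded answer: with p2 = 0.40 (rounded up from 0.3996866),
# the computed power should sit just above the 0.90 target.
res <- power.prop.test(n = 225, p1 = 0.27, p2 = 0.40, sig.level = 0.10)
res$power   # just above 0.90
```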

Example 2 Setup (1/4)

An investigator plans to replicate part of a study of the gut hormone fragment peptide \(YY_{3-36}\) (PYY) which reduces appetite and food intake when infused into subjects of normal weight. The original study is found here.

In common with the adipocyte hormone leptin, PYY reduces food intake by modulating appetite circuits in the hypothalamus. However, in obesity there is a marked resistance to the action of leptin, which greatly limits its therapeutic effectiveness.

Example 2 Setup (2/4)

The investigator wants to know whether obese subjects are also resistant to the anorectic effects of PYY. She intends to perform a randomized, placebo-controlled, double-blind crossover study on healthy obese subjects (including an equal number of male subjects and female subjects), with each subject studied on two occasions one week apart.

The subjects will be screened by a dietitian who will assess their eating behavior with several established scales.

Example 2 Setup (3/4)

On two consecutive Thursdays, we will measure caloric intake during a buffet lunch offered two hours after the infusion of that week’s exposure. In one of the weeks, the subject will receive an infusion of PYY, and in the other week (with the order of the weeks determined at random) the subject will receive a placebo. The number of calories consumed at each lunch is measured and then converted to an appetite rating. Our primary outcome is the difference between the appetite rating after PYY and the appetite rating after placebo.

Example 2 Setup (4/4)

A clinically meaningful difference, the investigator tells you, would be 30 or more points on the appetite rating scale comparing the two infusions, which is 60% of the anticipated standard deviation of these results. The investigator then asks whether a study gathering a total of 64 observations will yield at least 90% power to detect an effect of this size, using a 5% two-tailed significance level and meeting all other requirements described above.

Example 2: What code do we need?

  • Are we comparing means or proportions?
  • Paired or independent samples?
  • What do we want to estimate? Necessary sample size? Power? Something else?
  • Desired significance level \(\alpha\) = 0.05 (two-tailed)
  • Sample Size is 64 observations (or 32 pairs)
  • Delta is ?
  • Standard Deviation is ?

Example 2 Code

power.t.test(n = 32, delta = 30, sd = 50, sig.level = 0.05, type = "paired")

     Paired t test power calculation 

              n = 32
          delta = 30
             sd = 50
      sig.level = 0.05
          power = 0.9077895
    alternative = two.sided

NOTE: n is number of *pairs*, sd is std.dev. of *differences* within pairs

What is the proper conclusion? Do we have at least 90% power?
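One way to check the conclusion: ask power.t.test to solve for the number of pairs needed for exactly 90% power, and compare to the 32 pairs we have. (This is a sketch of the same calculation run in reverse.)

```r
# Solve for the number of pairs required for 90% power with the same
# delta, sd, and significance level; if the answer is below 32 pairs,
# our 64 observations (32 pairs) are enough.
power.t.test(delta = 30, sd = 50, sig.level = 0.05,
             power = 0.90, type = "paired")$n
# a bit over 31 pairs, so 32 pairs (64 observations) suffice
```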

ChatGPT and Large Language Models: Taking Stock

ChatGPT

https://chat.openai.com/

ChatGPT response March 2023

ChatGPT 3.5 response November 2023

ASA Statement (2023-08-04)

ASA Statement Excerpt 1

The Role of Statistics in Data Science and Artificial Intelligence (pdf): 2023-08-04

Statistics plays a central role in data science and AI, especially in the areas of ML and deep learning. Framing questions statistically allows leveraging data resources to extract knowledge and obtain better answers. The central dogma of statistical inference, that there is a component of randomness in data, enables researchers to formulate questions in terms of underlying processes, quantify uncertainty in their answers, and separate signal from noise. A statistical framework allows researchers to distinguish between causation and correlation, and thus to identify interventions that will cause changes in outcomes.

ASA Statement Excerpt 2

The Role of Statistics in Data Science and Artificial Intelligence (pdf): 2023-08-04

The future of data science and AI is uncertain, but one thing is clear: the field will undoubtedly continue to evolve rapidly and have a profound impact on society. As a result, the role of statisticians will also be subject to change and expansion, as they must adapt to new technologies and tools while continuing to provide expertise in traditional areas of statistics such as uncertainty quantification, sampling design, and causal inference. The need for interdisciplinary collaboration and a diverse range of skills will become increasingly important for statisticians to remain relevant in this dynamic and ever-changing field.

Session Information

R version 4.3.1 (2023-06-16 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 11 x64 (build 22621)


Locale:
  LC_COLLATE=English_United States.utf8 
  LC_CTYPE=English_United States.utf8   
  LC_MONETARY=English_United States.utf8
  LC_NUMERIC=C                          
  LC_TIME=English_United States.utf8    

time zone: America/New_York
tzcode source: internal

Package version:
  askpass_1.2.0       backports_1.4.1     base64enc_0.1.3    
  bigD_0.2.0          bit_4.0.5           bit64_4.0.5        
  bitops_1.0.7        blob_1.2.4          brio_1.1.3         
  broom_1.0.5         bslib_0.5.1         cachem_1.0.8       
  callr_3.7.3         cellranger_1.1.0    cli_3.6.1          
  clipr_0.8.0         colorspace_2.1-0    commonmark_1.9.0   
  compiler_4.3.1      conflicted_1.2.0    cpp11_0.4.6        
  crayon_1.5.2        curl_5.1.0          data.table_1.14.8  
  DBI_1.1.3           dbplyr_2.4.0        desc_1.4.2         
  diffobj_0.3.5       digest_0.6.33       dplyr_1.1.3        
  dtplyr_1.3.1        ellipsis_0.3.2      epibasix_1.5       
  evaluate_0.22       exact2x2_1.6.8      exactci_1.4-4      
  fansi_1.0.5         farver_2.1.1        fastmap_1.1.1      
  fontawesome_0.5.2   forcats_1.0.0       fs_1.6.3           
  gargle_1.5.2        generics_0.1.3      ggplot2_3.4.4      
  glue_1.6.2          googledrive_2.1.1   googlesheets4_1.1.1
  graphics_4.3.1      grDevices_4.3.1     grid_4.3.1         
  gridExtra_2.3       gt_0.10.0           gtable_0.3.4       
  haven_2.5.3         here_1.0.1          highr_0.10         
  hms_1.1.3           htmltools_0.5.6.1   htmlwidgets_1.6.2  
  httr_1.4.7          ids_1.0.1           isoband_0.2.7      
  janitor_2.2.0       jquerylib_0.1.4     jsonlite_1.8.7     
  juicyjuice_0.1.0    knitr_1.44          labeling_0.4.3     
  lattice_0.21.8      lifecycle_1.0.3     lubridate_1.9.3    
  magrittr_2.0.3      markdown_1.11       MASS_7.3.60        
  Matrix_1.6.1.1      memoise_2.0.1       methods_4.3.1      
  mgcv_1.8.42         mime_0.12           modelr_0.1.11      
  munsell_0.5.0       naniar_1.0.0        nlme_3.1.162       
  norm_1.0.11.1       openssl_2.1.1       parallel_4.3.1     
  pillar_1.9.0        pkgbuild_1.4.2      pkgconfig_2.0.3    
  pkgload_1.3.3       plyr_1.8.9          praise_1.0.0       
  prettyunits_1.2.0   processx_3.8.2      progress_1.2.2     
  ps_1.7.5            purrr_1.0.2         R6_2.5.1           
  ragg_1.2.6          rappdirs_0.3.3      RColorBrewer_1.1.3 
  Rcpp_1.0.11         reactable_0.4.4     reactR_0.5.0       
  readr_2.1.4         readxl_1.4.3        rematch_2.0.0      
  rematch2_2.1.2      reprex_2.0.2        rlang_1.1.1        
  rmarkdown_2.25      rprojroot_2.0.3     rstudioapi_0.15.0  
  rvest_1.0.3         sass_0.4.7          scales_1.2.1       
  selectr_0.4.2       snakecase_0.11.1    splines_4.3.1      
  ssanv_1.1           stats_4.3.1         stringi_1.7.12     
  stringr_1.5.0       sys_3.4.2           systemfonts_1.0.5  
  testthat_3.2.0      textshaping_0.3.7   tibble_3.2.1       
  tidyr_1.3.0         tidyselect_1.2.0    tidyverse_2.0.0    
  timechange_0.2.0    tinytex_0.48        tools_4.3.1        
  tzdb_0.4.0          UpSetR_1.4.0        utf8_1.2.3         
  utils_4.3.1         uuid_1.1.1          V8_4.4.0           
  vctrs_0.6.4         viridis_0.6.4       viridisLite_0.4.2  
  visdat_0.6.0        vroom_1.6.4         waldo_0.5.1        
  withr_2.5.1         xfun_0.40           xml2_1.3.5         
  yaml_2.3.7